Abstract


An autostereogram is a single image that encodes depth information,
which pops out when the image is viewed appropriately. The trick is
achieved by replicating a vertical strip that sets a basic
two-dimensional pattern, with disparity shifts that encode a
three-dimensional scene. It is of interest to explore the dependency
between the ease of perceiving depth in autostereograms and the choice
of the basic pattern used for generating them. In this work we confirm
a theory proposed by [5] to explain the process of autostereographic
depth perception, providing a measure for the ease of “locking into”
the depth profile, based on the spectral properties of the basic
pattern used. We report the results of three sets of psychophysical
experiments using autostereograms generated from two-dimensional
random noise patterns having power spectra of the form $1/f^\beta$.
The experiments were designed to test the ability of human subjects to
identify smooth, low resolution surfaces, as well as detail, in the
form of higher resolution objects in the depth profile, and to
determine limits in identifying small objects as a function of their
size. In accordance with the theory, we find a significant advantage of
the $1/f$ noise pattern (pink noise) for fast depth lock-in and fine
detail detection, showing that such patterns are optimal choices for
autostereogram design. Validating the theoretical model predictions
strengthens its underlying assumptions, and contributes to a better
understanding of the visual system’s binocular disparity mechanisms.


1 Introduction 

While the world around us is three-dimensional, the visual data is
inherently two-dimensional. Nevertheless, the third dimension can
often be inferred from one or more images, utilizing cues such as
occlusion, size, texture, lighting and shading, prior shape
information etc. [22]. Stereo vision provides binocular cues such as
the vergence angle, formed between the axes from the eyes to the
convergence point on which both eyes fixate, and, most importantly,
binocular disparity, which is the horizontal displacement between
matching features in the images acquired by pairs of eyes (or cameras).

The seminal work of Julesz [13] established that depth information can
be retrieved by determining correspondences of matching local
features. The correspondence problem is widely studied, and over the
years many models for stereopsis have been proposed. Julesz [13]
further showed that depth can be perceived from disparity alone,
without any other visual cues, by creating Random-Dot Stereograms,
which are pairs of similar images consisting of randomly placed dots
in the plane, one of them having part of the dots slightly displaced to
encode depth.

The phenomenon of seeing illusory depth in repeating patterns was
revealed even earlier by Brewster [4] in what became known as the
wallpaper effect. Ittelson [12] also reported that effect, observing
that wall surfaces with repeating patterns appear to move forward
after long stares, and letters on a typewriter keyboard sometimes
perceptually merge into one.

Tyler et al. further studied this idea and discovered that depth can
also be encoded in a single random-dot image by cleverly producing
“stereo pairs” that are identical, hence effectively fusing them into
one image, called a Single-Image Random-Dot Stereogram (SIRDS) or, more
generally, an autostereogram.

An autostereogram is a periodic pattern with horizontal deformations
modulated by a spatial depth function, such that pixel values are a
function of the constant distance between the eyes and the encoded
scene’s local depth. Corresponding pixels, i.e. pixels which originate
at the same point in the depth scene, are given the same color or
gray-level value, and, being seen as identical by the two eyes, they
will potentially be matched in the binocular viewing process.

Autostereograms became widespread as a very popular art form due to a
series of books titled “Magic Eye”, and were proposed for applications
like 3D photography and computer graphics. Since they can be viewed
without special equipment and are easily manipulated to create various
visual effects and illusions, autostereograms can serve as an
important tool in the study of depth perception and in binocular
vision research in general.

As part of the research efforts on autostereograms, Tyler and Clarke
also invented more complex autostereograms capable of encoding
multiple depths. This concept was further examined in subsequent work.

Thimbleby et al. [28] proposed a computational algorithm for
autostereogram generation based on geometrical constraints. Over the
years, other algorithms for autostereogram generation were suggested,
such as those by Minh et al. [19], [20] and by Geselowitz [9]. Some
research also dealt with depth-map reconstruction from autostereograms.

Only a few studies have investigated the conditions under which
autostereogram viewing is easier. Ditzinger et al. [8] found that
random noise added to the repeated patterns can improve depth
perception and reduce hysteresis effects. The work of Bruckstein et
al. [5] is the only one to deal with the effect of the choice of basic
pattern and to develop a theoretical framework for analyzing the ease
of depth perception in autostereograms with respect to the underlying
noise pattern used for the autostereogram generation. Their model
predicts that autostereograms created from pink noise patterns (having
a power spectrum of $1/f^\beta$ with $\beta = 1$) should be more easily
perceived than those created using other noise patterns, as detailed
in the Appendix. Whereas depth “lock-in” is anticipated to break at
fine scales for $\beta > 1$ and at coarse scales for $\beta < 1$, pink
noise leads to scale-invariant match functions and is therefore
optimal for locking in and maintaining the depth perception effect
across scales.

This theoretical result resonates nicely with studies on natural
images, which found that such images tend to have power spectra of
$1/f$ [7], [29], suggesting that our eyes may be adapted by evolution
to optimally perceive such patterns.

Though mathematically well established, the 
model suggested by [5] has never been experimentally 
tested. The current work aims to further understand 
the autostereogram depth perception mechanism by 
testing and verifying this model using psychophysical 
experiments. 


2 Methods 

In order to test and confirm the prediction of the model suggested by
[5], we conducted three sets of psychophysical experiments using
autostereograms. In the first two experiments we evaluated the effect
of different noise patterns on perception of low resolution surfaces
and higher resolution objects (letter profiles) in the depth
dimension. In the third experiment we explored the effect of different
noise patterns on the limits of identifying small objects in the depth
dimension as a function of their size.


2.1 Subjects 

Fifteen participants between the ages of 20 and 34 took part in the
experiments. All the participants had normal or corrected-to-normal
vision, and all were first tested for their ability to perceive depth
hidden in autostereograms. The participants were unaware of the
purpose of the study. In the third experiment, only seven of the
original fifteen participants took part.
2.2 Stimuli


The stimuli in all our experiments are autostereograms that were
generated from pre-designed depth maps and a desired noise spectrum
according to the algorithm described below.

First, a basic noise patch of 128 x 128 pixels having a noise spectrum
of $1/f^\beta$ was stochastically generated for each autostereogram.
The patch was then up-sampled by a factor of 2 to create a 256 x 256
block with basic “pixels” of size 2 x 2. Up-sampling was performed to
enable easier viewing of the autostereograms on large monitors. The
up-sampled patch was replicated vertically to create a strip of
256 x 1024 pixels, and the strip was horizontally replicated 6 times
based on the viewing geometry and the given depth map, similarly to
the process described in [28], resulting in the final autostereogram
sized 1536 x 1024.
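The generation steps above can be sketched as follows. This is a
minimal illustration, not the authors' implementation: it assumes
NumPy, shapes white Gaussian noise in the Fourier domain to obtain a
$1/f^\beta$ power spectrum, and replaces the geometric algorithm of
[28] with a simplified left-to-right copy rule. The function names and
the `max_shift` depth-to-disparity scaling are hypothetical.

```python
import numpy as np

def power_law_noise(n, beta, rng):
    """Return an n x n patch in [0, 1] whose power spectrum ~ 1/f**beta."""
    fy = np.fft.fftfreq(n)[:, None]
    fx = np.fft.fftfreq(n)[None, :]
    f = np.hypot(fx, fy)
    f[0, 0] = 1.0                        # avoid division by zero at DC
    amplitude = f ** (-beta / 2.0)       # power = amplitude**2 ~ 1/f**beta
    amplitude[0, 0] = 0.0                # drop the DC term (zero-mean patch)
    spectrum = np.fft.fft2(rng.standard_normal((n, n))) * amplitude
    patch = np.real(np.fft.ifft2(spectrum))
    patch -= patch.min()
    return patch / patch.max()

def base_strip(beta, rng, patch=128, up=2, height=1024):
    """Up-sample the patch by pixel replication and tile it vertically."""
    tile = np.kron(power_law_noise(patch, beta, rng), np.ones((up, up)))
    return np.tile(tile, (height // tile.shape[0], 1))   # (1024, 256)

def autostereogram(depth, strip, max_shift=None):
    """Simplified generator: every pixel copies the pixel one local period
    to its left. Larger depth-map values shorten the local period.
    depth: (H, W) array in [0, 1]; strip: (H, P) base pattern."""
    h, w = depth.shape
    period = strip.shape[1]
    if max_shift is None:
        max_shift = period // 4          # assumed scaling, not from the paper
    shift = np.rint(depth * max_shift).astype(int)
    rows = np.arange(h)
    out = np.empty((h, w))
    out[:, :period] = strip
    for x in range(period, w):
        out[:, x] = out[rows, x - period + shift[:, x]]
    return out

rng = np.random.default_rng(0)
strip = base_strip(beta=1.0, rng=rng)                    # pink noise
image = autostereogram(np.zeros((1024, 1536)), strip)    # flat depth map
```

With a flat depth map the output is simply the strip repeated six
times, which matches the 1536 x 1024 size quoted above; a non-trivial
depth map locally shortens the repeat period.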

Next, we shall describe the depth maps designed 
for each experiment. 




Figure 1: Depth maps used for experiment 1: (a) Egg crate,
(b) Diagonal sine wave, (c) Ellipsoid, (d) Mexican hat

2.2.1 Experiment 1 - Surface Recognition 

Low resolution in the depth dimension was represented by depth maps of
four different smooth profiles. These depth profiles were designed
using continuous functions so as to avoid occlusions,
mis-correspondences and echoes in the creation of the autostereogram
[28], and were considerably different from one another in order to
avoid confusion in their identification (see Figure 1).

The autostereograms were constructed using five noise patterns having
a spectrum of $1/f^\beta$ with $\beta = 0, 1/2, 1, 3/2, 2$. Examples of
the generated autostereograms for the different noise patterns are
depicted in Figure 2.

A collection of 140 autostereograms was constructed for the
experiment: each surface was used 7 times for each type of noise
pattern.

2.2.2 Experiment 2 - Detail Discrimination 

High resolution detail in the depth dimension was created by using the
depth profiles of four possible letters superimposed on the smooth
surface of an ellipsoid (Figure 3).

Since the detail in depth rides on top of a smooth ellipsoidal
surface, we in fact test the identification of higher resolution
objects in the presence of a smooth, low resolution background. In
that regard, it should be noted that without the ellipsoid background,
a ripple artifact appeared in the generated autostereograms, which
could have led to the hidden letter being identified even without
perceiving the depth dimension.

There were four possible detail depth-profiles in the shape of the
letters S, X, L, and T. In addition, there was a fifth option of no
letter present, serving as a control option to check whether
participants actually perceive the detail. The letters were 240 x 240
pixels in size and were placed vertically in the middle of the surface
with horizontal displacements of up to 400 pixels to the left or
right. Those horizontal displacements were used so that participants
would not fixate on a single location in the autostereograms but
instead search for the letter in the proximity of the center, while
still placing the letters on a smooth background slope.
Figure 2: Examples of autostereograms of the egg crate depth map
(Figure 1a) with different noise patterns of the form $1/f^\beta$:
(a) $\beta = 0$ (white noise), (b) $\beta = 1/2$, (c) $\beta = 1$
(pink noise), (d) $\beta = 3/2$, (e) $\beta = 2$ (brown noise).
The background surface was normalized to occupy 60% of the gray-level
range, and the letters (when added) were scaled to 10% of that range,
i.e. they rise above the highest point of the surface at a ratio of
1/6 relative to the surface's own range. After incorporating the
letters, a 5 x 5 low-pass filter was applied to smooth the letter
boundaries, in order to avoid mis-correspondence issues that could
result in echoes in the generated autostereogram [28]. The letters
were selected to be considerably different from one another in order
to avoid confusion in their identification.
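The depth-map construction just described (a background occupying 60%
of the gray-level range, a letter adding 10%, then a 5 x 5 low-pass
filter) can be sketched as below. The text does not specify the filter
kernel or the border handling, so the uniform (box) kernel with edge
replication is an assumption, as are the function names, the tiny
16 x 16 map and the bar-shaped stand-in for a letter.

```python
def normalize(surface, top):
    """Rescale a 2-D list of depths linearly to the range [0, top]."""
    lo = min(map(min, surface))
    hi = max(map(max, surface))
    return [[top * (v - lo) / (hi - lo) for v in row] for row in surface]

def add_letter(background, mask, row0, col0, height):
    """Raise the background by `height` wherever the letter mask is set."""
    out = [row[:] for row in background]
    for r, mask_row in enumerate(mask):
        for c, on in enumerate(mask_row):
            if on:
                out[row0 + r][col0 + c] += height
    return out

def box_filter(img, size=5):
    """Uniform size x size mean filter with edge replication (assumed)."""
    h, w, k = len(img), len(img[0]), size // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(-k, k + 1):
                for dx in range(-k, k + 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += img[yy][xx]
            out[y][x] = acc / size ** 2
    return out

# Toy example: a paraboloid cap standing in for the ellipsoid background.
n = 16
cap = [[-((x - n / 2) ** 2 + (y - n / 2) ** 2) for x in range(n)]
       for y in range(n)]
background = normalize(cap, top=0.6)          # 60% of the gray-level range
bar = [[1] * 4 for _ in range(6)]             # 6 x 4 block as a fake letter
depth_map = box_filter(add_letter(background, bar, 5, 6, height=0.1))
```

Because averaging cannot exceed the largest input value, the smoothed
map stays within 70% of the gray-level range, and the filter rounds
off the step at the letter boundary.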

As in the first experiment, the autostereograms were generated using
five noise patterns associated with $\beta = 0, 1/2, 1, 3/2, 2$. A
collection of 125 autostereograms was constructed for the experiment:
each letter, or lack thereof, was used 5 times in conjunction with
each noise pattern.

Examples of depth maps with letters incorporated are presented in
Figure 3.

Figure 3: Examples of depth maps used for experiment 2


2.2.3 Experiment 3 - Fine Detail Identification Limits

Similar to the second experiment, high resolution in the depth
dimension was represented by different letters superimposed on an
ellipsoid surface (Figure 4). The letters selected for this experiment
were P and B. This time the letters were deliberately chosen to be
similar to one another in order to check for identification accuracy,
and, as opposed to the second experiment, a letter was always
inserted. In order to reach the limits of identification, the letter
sizes were smaller: 20 x 20, 40 x 40, 60 x 60, 80 x 80, or 100 x 100
pixels.

The background surface was again normalized to 
occupy 60% of the gray level range, and the letters 
were scaled to 12%, 10% or 8.57% of that range, i.e. 
higher than the highest point of the surface at a ratio 
of 1/5,1/6 or 1/7. After incorporating the letters, 
here too a 5 x 5 low-pass filter was applied to smooth 
the resulting surface. 

The autostereograms were constructed using three noise patterns
associated with $\beta = 0, 1/2, 1$. A collection of 180
autostereograms was generated for the experiment: each letter was used
twice for each combination of letter size and relative depth and for
each noise pattern.
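As a quick consistency check on the three stimulus sets, the designs
can be enumerated as Cartesian products. The sketch below only
restates the factors given in the text; the list names and the use of
`None` for the no-letter control are illustrative choices.

```python
from itertools import product

surfaces = ["egg crate", "diagonal sine wave", "ellipsoid", "Mexican hat"]
betas_all = [0, 0.5, 1, 1.5, 2]

# Experiment 1: 4 surfaces x 5 noise patterns x 7 repetitions
exp1 = list(product(surfaces, betas_all, range(7)))

# Experiment 2: 4 letters plus the no-letter control x 5 noises x 5 repetitions
exp2 = list(product(["S", "X", "L", "T", None], betas_all, range(5)))

# Experiment 3: 2 letters x 5 sizes x 3 relative depths x 3 noises x 2 repetitions
exp3 = list(product(["P", "B"], [20, 40, 60, 80, 100],
                    ["1/5", "1/6", "1/7"], [0, 0.5, 1], range(2)))

print(len(exp1), len(exp2), len(exp3))   # 140 125 180
```

The three counts agree with the collection sizes of 140, 125 and 180
autostereograms stated for the experiments.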

Examples of the depth maps with incorporated letters used for the
identification limit experiment are presented in Figure 4.

Figure 4: Examples of depth maps used for experiment 3: (a) The letter
P (100 x 100 pixels), (b) The letter B (80 x 80 pixels)
2.3 Procedure


The experiments were conducted in an isolated room at the Intelligent
Systems Lab (ISL) at the Technion. All the participants used the same
computer and screen, with the same room illumination, and were shown
the same autostereograms (in randomized order). The autostereograms
were displayed in full-screen mode on a 22” LCD monitor with a
1680 x 1050 resolution.

The Psychophysics Toolbox Version 3 for MATLAB [3], [15] was used for
displaying the autostereograms and for collecting the results.

The procedure was as follows: An autostereogram was picked from the
randomly ordered collection and displayed on the screen. The
participants, their hand on the computer mouse at all times, had to
press the left mouse button when perceiving the hidden surface. A
selection screen then appeared offering a choice between all the
possible depth maps (surfaces or letters) and an “undefinable” option.
After an answer was selected, a new autostereogram was displayed, and
so forth until the set was completed.

The selection screen, besides collecting the results, resets the focus
and convergence that were previously achieved by the participants
[25], thereby ensuring that the adjustment process starts from the
same point each time. The selection screens used for the different
experiments are presented in Figure 5.

The response time (RT) from the appearance of the autostereogram to
the left mouse button click signaling identification was measured, and
the selection was checked for correctness.

Since there was no time limit on the identification 
of the hidden object, participants were instructed to 
“give up” after a long period of time and choose the 
“undefinable” option in the selection screen. 

In each experiment, participants were first shown a training set of 10
autostereograms randomly chosen from the experiment’s autostereogram
pool, so that they could familiarize themselves with the test
environment, calibrate their viewing position to their optimum and get
a sense of when they need to “give up”. After the training set, a
white screen appeared and the participants were instructed to press
the left mouse button when ready to proceed to the actual test.


Figure 5: Selection screens used for the different experiments:
(a) Experiment 1, (b) Experiment 2, (c) Experiment 3. Each screen
shows the prompt “Click on the viewed object: (When finished press
NEXT)”.
3 Results and Discussion


Participants were tested for accuracy and response time. Accuracy was
measured by the percentage of correct answers and by the numbers of
mistakes and choices of “undefinable” for each noise. A mistake was
counted when the selection was neither “undefinable” nor the correct
answer. The mean and standard deviation (STD) of the measured response
times were computed using only the correct responses. A two-sample
one-tailed t-test was used to evaluate whether the mean response times
obtained for different noise patterns differ significantly.
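A one-tailed two-sample comparison of this kind can be sketched with
the standard library alone. The text does not say whether a pooled or
an unequal-variance statistic was used, so the Welch form below is an
assumption; in addition, the p-value uses a normal approximation in
place of the Student t distribution, which is only reasonable for
samples of several hundred response times. The response times in the
example are invented for illustration and are not data from the
experiments.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def one_tailed_test(a, b):
    """Test H1: mean(a) > mean(b).

    Returns the Welch statistic and an approximate p-value obtained from
    the standard normal distribution (an approximation to Student's t).
    """
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    t = (mean(a) - mean(b)) / se
    return t, 1.0 - NormalDist().cdf(t)

# Hypothetical response times in seconds (illustration only).
rt_white = [2.4, 2.6, 2.5, 2.7, 2.3, 2.8]
rt_pink = [2.0, 2.1, 1.9, 2.2, 2.0, 1.8]

t_stat, p_value = one_tailed_test(rt_white, rt_pink)
significant = p_value < 0.05   # the threshold used in the result tables
```

A hypothesis such as "RT for white noise exceeds RT for pink noise" is
then reported as significant when the p-value falls below 0.05.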


3.1 Experiment 1 

Table 1 shows the rate of correct answers and the number of mistakes
or choices of “undefinable” versus the noise pattern used. All noise
types exhibit a high correctness rate, indicating that smooth depth
maps are easily recognizable across all of the mentioned noise
patterns. Yet a slightly lower accuracy can be observed for white
noise, due to its relatively high number of “undefinable” selections.

Since no time limit was posed on identification, the 
accuracy is very high for all the noise patterns, and 
so we examined the differences in response times. 

Figure 6 presents the mean response time (RT) versus noise pattern.
While the mean RT is typically around 2 seconds, in 8 of the 2100
samples (which constitute 0.38%) RTs over 10 seconds and up to 76
seconds were measured. We therefore consider those samples as outliers
and exclude them from our analysis.

A one-tailed t-test was performed on the response times of every pair
of noise patterns to test whether the mean response times differ
significantly. The results of the t-test are presented in Table 2.

It can be observed that, in accordance with [5], smooth surfaces
hidden in autostereograms made of white noise patterns are
significantly harder to perceive than with any other noise pattern.
The best results are obtained for $\beta = 3/2$, with a
non-significant difference from performance with
$1 \le \beta \le 2$.



Figure 6: Mean Response Time vs. Noise Patterns in 
the Surface Recognition Test (Experiment 1). Error 
bars show standard error of the mean. 


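Hypothesis                            Significant (P-value < 0.05)   P-value
$RT_{\beta=0} > RT_{\beta=1/2}$       yes                            2.954e-4
$RT_{\beta=0} > RT_{\beta=1}$         yes                            5.68e-6
$RT_{\beta=0} > RT_{\beta=3/2}$       yes                            9.858e-9
$RT_{\beta=0} > RT_{\beta=2}$         yes                            3.656e-7
$RT_{\beta=1/2} > RT_{\beta=1}$       no                             0.1389
$RT_{\beta=1/2} > RT_{\beta=3/2}$     yes                            0.0063
$RT_{\beta=1/2} > RT_{\beta=2}$       yes                            0.0338
$RT_{\beta=1} > RT_{\beta=3/2}$       no                             0.0794
$RT_{\beta=1} > RT_{\beta=2}$         no                             0.216
$RT_{\beta=3/2} < RT_{\beta=2}$       no                             0.2862

Table 2: T-Test Results for the Surface Recognition Test
(Experiment 1)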
                         White Noise                 Pink Noise                  Brown Noise
                         $\beta=0$     $\beta=1/2$   $\beta=1$     $\beta=3/2$   $\beta=2$
Correct Rate             99.05%        99.76%        99.52%        99.29%        99.52%
Number of Mistakes       1             1             2             2             1
Number of Undefinables   3             0             0             1             1

Table 1: Accuracy vs. Noise Patterns in the Surface Recognition Test (Experiment 1)


3.2 Experiment 2 

Table 3 shows the rate of correct answers and the number of mistakes
or choices of “undefinable” versus the noise pattern used in
experiment 2. All the noise patterns exhibit high accuracy in letter
recognition, with a slight deterioration observed for brown noise
($\beta = 2$).

Figure 7 presents the mean and standard error of the response time
versus noise pattern. T-test results are presented in Table 4.



Figure 7: Mean Response Time vs. Noise Patterns 
in the Detail Discrimination Test (Experiment 2) 

It can be observed that the best performance is obtained for noise
spectra with $1/2 \le \beta \le 3/2$, with non-significant differences
between them. Brown noise is inferior to all other noise patterns in
terms of both accuracy and response time. White noise exhibits a high
response time, presumably due to the amount of time it takes to first
identify the smooth background (experiment 1), but a high correctness
rate with no mistakes made by the participants.

Hypothesis                            Significant (P-value < 0.05)   P-value
$RT_{\beta=0} > RT_{\beta=1/2}$       yes                            0.0162
$RT_{\beta=0} > RT_{\beta=1}$         yes                            0.0059
$RT_{\beta=0} > RT_{\beta=3/2}$       yes                            0.0152
$RT_{\beta=0} < RT_{\beta=2}$         no                             0.3938
$RT_{\beta=1/2} > RT_{\beta=1}$       no                             0.3689
$RT_{\beta=1/2}$ vs. $RT_{\beta=3/2}$ no                             0.4902
$RT_{\beta=1/2} < RT_{\beta=2}$       yes                            0.0144
$RT_{\beta=1} < RT_{\beta=3/2}$       no                             0.356
$RT_{\beta=1} < RT_{\beta=2}$         yes                            0.0059
$RT_{\beta=3/2} < RT_{\beta=2}$       yes                            0.0137

Table 4: T-Test Results for the Detail Discrimination Test
(Experiment 2)

3.3 Experiment 3 

The results of the previous experiment indicate that pink noise is
superior to white noise in terms of response time in identification of
high resolution details. However, in order to ensure that this did not
result from the incorporated letters being too big, the third
experiment compares these noises again for significantly smaller
letters, where white noise is expected to have an advantage over the
other noise patterns. Brown noise ($\beta = 2$) was not included in
this experiment as it is expected to perform worst for fine detail
identification.

Since our aim in this experiment was testing perception limits,
identification was not limited in time and we focused our analysis
mainly on the identification accuracy.

                         White Noise                 Pink Noise                  Brown Noise
                         $\beta=0$     $\beta=1/2$   $\beta=1$     $\beta=3/2$   $\beta=2$
Correct Rate             100%          99.47%        100%          99.47%        97.07%
Number of Mistakes       0             0             0             0             6
Number of Undefinables   0             2             0             2             5

Table 3: Accuracy vs. Noise Patterns in the Detail Discrimination Test (Experiment 2)

The correct rate as a function of the letter size is presented in
Figure 8. Data were accumulated and averaged over the 3 different
relative depths of the letters with respect to the background map.



Figure 8: Correct Rate vs. Letter Size in the identi¬ 
fication limit test (Experiment 3) 

Table 5 displays the percentage of correct answers, mistakes and
“undefinable” choices per noise, calculated for all participants as a
function of both letter size and relative depth.

It can be observed that for all the tested noises, accuracy increases
with letter size. Its dependence on the relative depth, however, shows
no clear pattern. A previous experiment we performed, testing a wider
range of relative depths, also did not reveal any clear dependency.
The letter’s size is therefore much more significant to its correct
identification than its relative depth.

A one-tailed t-test revealed no significant difference in accuracy
between the different noise patterns. The lack of significance
indicates that pink noise performs comparably well for the highest
resolutions, as predicted by [5].

Figure 9 presents the mean response time as a function of the letter
size. For all noises, response time decreases with letter size. Our
results therefore agree with the previously reported observation that
the time required for stereopsis increases as the size of the hidden
object decreases.



Figure 9: Mean Response Time vs. Letter Size in the 
identification limit test (Experiment 3) 

In accordance with the results of experiment 2, the response time (RT)
for pink noise is lower than the RT for white noise even when
resolution increases. This trend switches direction and indicates
superiority of white noise only for letters no larger than 40 x 40
pixels, i.e. when approaching the identification limits. For such
small letters, the number of correct answers available for the RT
analysis is very small, and the observed difference was found to be
statistically insignificant.

$\beta = 0$ (white noise)
Depth\Size |       20 x 20       |       40 x 40       |       60 x 60       |       80 x 80       |      100 x 100      |      Row Mean
           |      u      m      c |      u      m      c |      u      m      c |      u      m      c |      u      m      c |      u      m      c
1/5        |  85.71      0  14.29 |  14.29   3.57  82.14 |   3.57      0  96.43 |      0      0    100 |      0      0    100 |  20.71   0.71  78.57
1/6        |  82.14   10.7   7.14 |  32.14   3.57  64.29 |   3.57      0  96.43 |      0      0    100 |      0      0    100 |  23.57   2.86  73.57
1/7        |  92.86      0   7.14 |  14.29   3.57  82.14 |   3.57      0  96.43 |      0      0    100 |      0      0    100 |  22.14   0.71  77.14
Col. Mean  |   86.9   3.57   9.52 |  20.24   3.57  76.19 |   3.57      0  96.43 |      0      0    100 |      0      0    100 |

$\beta = 1/2$
Depth\Size |       20 x 20       |       40 x 40       |       60 x 60       |       80 x 80       |      100 x 100      |      Row Mean
           |      u      m      c |      u      m      c |      u      m      c |      u      m      c |      u      m      c |      u      m      c
1/5        |  89.29   7.14   3.57 |  17.86      0  82.14 |      0   3.57  96.43 |      0      0    100 |      0      0    100 |  21.43   2.14  76.43
1/6        |  85.71   3.57  10.71 |  35.71   7.14  57.14 |      0   3.57  96.43 |      0      0    100 |      0      0    100 |  24.29   2.86  72.86
1/7        |  89.29   3.57   7.14 |  17.86      0  82.14 |   3.57      0  96.43 |      0      0    100 |      0      0    100 |  22.14   0.71  77.14
Col. Mean  |   88.1   4.76   7.14 |  23.81   2.38  73.81 |   1.19   2.38  96.43 |      0      0    100 |      0      0    100 |

$\beta = 1$ (pink noise)
Depth\Size |       20 x 20       |       40 x 40       |       60 x 60       |       80 x 80       |      100 x 100      |      Row Mean
           |      u      m      c |      u      m      c |      u      m      c |      u      m      c |      u      m      c |      u      m      c
1/5        |  85.71   7.14   7.14 |   10.7   7.14  82.14 |      0      0    100 |   3.57   3.57  92.86 |      0      0    100 |     20   3.57  76.43
1/6        |  96.43      0   3.57 |   35.7      0  64.29 |   3.57      0  96.43 |      0      0    100 |      0      0    100 |  27.14      0  72.86
1/7        |  92.86      0   7.14 |   7.14      0  92.86 |      0      0    100 |      0      0    100 |   3.57      0  96.43 |  20.71      0  79.29
Col. Mean  |  91.67   2.38   5.95 |   17.9   2.38  79.76 |   1.19      0  98.81 |   1.19   1.19  97.62 |   1.19      0  98.81 |

Table 5: Rate [%] of Correct (c), Mistaken (m) and Undefinable (u)
Selections per Letter Size and Relative Depth, for $\beta = 0$ (White
Noise), $\beta = 1/2$ and $\beta = 1$ (Pink Noise)

We consider the identification limit for letter size in distinguishing
between the letters P and B as the minimal size resulting in a correct
rate of above 50%. For all the tested noise patterns, the letter size
identification limit lies between 20 x 20 and 40 x 40 pixels (meaning
between 4 and 8 millimeters). Assuming the undefinable choice is
equivalent to an equally distributed guess, the identification limit
for letter size is even lower and can be determined at 20 x 20 for all
the noise patterns.
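The two readings of the identification limit can be checked directly
against the column means of Table 5. The sketch below takes the limit
as the smallest tested size whose correct rate exceeds 50%, and, for
the second reading, credits half of the “undefinable” selections as
correct, since a uniform guess between P and B succeeds half the time.
The function and variable names are illustrative.

```python
SIZES = [20, 40, 60, 80, 100]   # letter side length in pixels

# Column means from Table 5 as (undefinable %, mistaken %, correct %).
COL_MEANS = {
    "beta=0": [(86.9, 3.57, 9.52), (20.24, 3.57, 76.19),
               (3.57, 0, 96.43), (0, 0, 100), (0, 0, 100)],
    "beta=1/2": [(88.1, 4.76, 7.14), (23.81, 2.38, 73.81),
                 (1.19, 2.38, 96.43), (0, 0, 100), (0, 0, 100)],
    "beta=1": [(91.67, 2.38, 5.95), (17.9, 2.38, 79.76),
               (1.19, 0, 98.81), (1.19, 1.19, 97.62), (1.19, 0, 98.81)],
}

def identification_limit(rows, undefinable_as_guess=False):
    """Smallest tested size whose (adjusted) correct rate exceeds 50%."""
    for size, (u, m, c) in zip(SIZES, rows):
        rate = c + u / 2 if undefinable_as_guess else c
        if rate > 50:
            return size
    return None

for noise, rows in COL_MEANS.items():
    print(noise,
          identification_limit(rows),
          identification_limit(rows, undefinable_as_guess=True))
# beta=0 40 20
# beta=1/2 40 20
# beta=1 40 20
```

For every noise pattern the raw correct rate first exceeds 50% at
40 x 40 pixels, while the guess-adjusted rate already does so at
20 x 20 (about 53%, 51% and 52% respectively).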

4 Conclusions 

This work validated the prediction made by [5] that autostereograms
created with pink noise patterns are more easily and correctly
perceived than those generated from other noise patterns.

The first experiment tested how the choice of basic noise pattern
affects the time and accuracy of perceiving smooth depth maps. It was
found that, in accordance with the model of [5], recognizing smooth
depth profiles is easier in autostereograms created from noise
patterns with $\beta \ge 1$ than in those with $\beta < 1$. Hence the
results indicate that, for identifying low resolution objects,
autostereograms created with pink noise patterns exhibit significantly
better depth lock-in behavior than those created with white noise, and
not significantly worse than those created with higher order noises.

The second experiment checked the prediction of [5] that recognition
of high resolution objects (like letters) will be easy for white and
pink noises ($\beta \le 1$), while harder for smoother noise patterns
($\beta > 1$). It was found that when a mixture of low and high
resolution is concerned, response time for pink noise is significantly
lower compared with both white and brown noise patterns. After
compensating for the contribution of the smooth background to the
measured response times based on the first experiment, our results
agree with the prediction of Bruckstein et al. [5].

To further test detection performance for high resolution detail, the
third experiment focused on discovering the limit of identifying fine
details in the depth dimension. It should be pointed out that this
experiment is preliminary and should be extended. Specifically, an
interactive adjustment of the letter size with an adaptive step is
expected to give a more accurate estimation of the identification
limit for different noise types. Nevertheless, the experiment shows
that even when approaching the high resolution limits, pink noise is
comparable to white noise in terms of the identification accuracy.

The three experiments performed demonstrate the superiority of pink
noise based autostereograms over other noise patterns in perception of
both low and high resolution objects. Hence, our experimental results
substantiate the model proposed by [5] as a good mathematical
framework for analyzing autostereograms.

We note in closing that besides the basic noise patterns used for
creating the autostereogram, other factors may have an influence on
the ease of depth perception in autostereograms. Future research may,
for example, study the effect of using color compared with gray-level
SIRDS, using specific colors over others, or using regular structured
patterns rather than random noise. While our experiments did not
reveal a clear dependency on the relative depth of the high-resolution
detail for superimposed objects, this dependency should also be
further studied using a different or wider range of depths.

We believe that autostereograms, beyond being an amazingly popular art
form, can continue to help advance the fields of 3D rendering,
camouflage, visual physiology and games, and still have considerable
stereo-vision research potential. Validating the theoretical model
proposed by [5] strengthens its underlying assumptions, contributing
to a better understanding of stereopsis and the correspondence
detection mechanism in the human visual system.


A The Model 

The model proposed by [5] for the stereopsis matching process was
developed in order to understand what makes some autostereograms
easier to perceive than others. We present here a simplified
one-dimensional problem, referring to each image line individually.

Let us denote the depth profile by $\varphi(x)$. A general point on
the depth surface is projected onto pixel $x$ in the left-eye image
$I_L$, and onto pixel $\hat{x}$ in the right-eye image $I_R$, as
illustrated in Figure 10. With the two images fused into one, for the
two pixels to be matched as originating from the same point in space
they must have the same value, i.e.
$I_L(x) = I_R(\hat{x}) = I(x)$.



Figure 10: Stereogram image pairs as seen by the left 
and right eyes 


The disparity $\Delta$ from $x$ to $\hat{x}$ is given by

$$\Delta = \frac{E\,\varphi(x)}{\varphi(x) + D} \qquad (1)$$

where $E$ is the distance between the eyes and $D$ the viewing
distance from the image plane.
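As a small worked example of equation (1), read here as the disparity
being E times the depth divided by (depth + D), the sketch below
evaluates it for a few depths. The eye separation of 65 mm and viewing
distance of 500 mm are assumed illustrative values, not parameters
reported for the experiments.

```python
def disparity(depth, eye_sep, view_dist):
    """Disparity on the image plane for a point `depth` behind it.

    All three quantities share one length unit; the result is in the
    same unit.
    """
    return eye_sep * depth / (depth + view_dist)

E_MM = 65.0    # assumed inter-ocular distance
D_MM = 500.0   # assumed viewing distance

for depth_mm in (0.0, 100.0, 500.0):
    print(depth_mm, disparity(depth_mm, E_MM, D_MM))
```

The disparity vanishes for points on the image plane, grows
monotonically with depth, and approaches the eye separation for very
distant points; a point as far behind the plane as the viewer is in
front of it gives exactly half the eye separation.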

Assuming $\varphi(x)$ is bounded, $I$ over any interval
$[x, x + \varphi(x))$ determines $I(\cdot)$ completely.

When observing an autostereogram image $I$, the disparities are to be
decoded in order to enable perception of the hidden depth dimension.
This process could be formulated by defining an abstract bivariate
matching function $\Lambda(x, \hat{x}) \in [0, 1]$ that indicates how
well $I(x)$ locally matches $I(\hat{x})$ (where a value of 1 indicates
a perfect match). This function has a high ridge along the obvious
match (planar interpretation) $\hat{x} = x$, and additional matches
associated with the disparity, which occur at
$\hat{x} = x + \varphi(x)$,
$\hat{x} = x + \varphi(x) + \varphi(x + \varphi(x))$, etc. The ridges
of the bivariate matching function are illustrated in Figure 11.





For better depth lock-in, we would like the matching
function $\Lambda(x, \tilde{x})$ to have a high ridge (i.e.
a deep basin of attraction) on the desired disparity
$\tilde{x} = x + \varphi(x)$, and lower ridges for the
other disparities, with valleys between the ridges that
let the interpretation mechanism move between them
reasonably easily.

Figure 12 demonstrates the possible behavior of the
matching function for different choices of the basic
pattern $I$. While the red curve has very sharp ridges on
the “correct” disparities, it is difficult to “leave” the
planar interpretation and “lock in” to the desired depth
profile ridge. In the green curve, $I$ leads to blurred
ridges and no sharp depth perception. The blue curve
represents the desired function, where the ridges are
sharp for the correct disparities, yet the valleys are
surmountable.

Embracing the squared difference approach previously
suggested by [26], the model of [5] proposes that the
matching function be computed as follows:

$$\Lambda(x, \tilde{x}) = f\left([I(x) - I(\tilde{x})]^2\right) \qquad (2)$$

where $f$ is some smooth, monotonically decreasing
function satisfying $f(0) = 1$ and $f'(0) < 0$.
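A minimal sketch of Eq. (2) for a single image line is
given below. It assumes a flat depth profile (constant
disparity `period`) and the choice $f(z) = e^{-\lambda z}$,
which satisfies $f(0) = 1$ and $f'(0) < 0$ but is only one
admissible example; all names are hypothetical.

```python
import math
import random


def matching(line, lam=1.0):
    """Lambda[x][xt] = f((I(x) - I(xt))^2) with the assumed f(z) = exp(-lam * z)."""
    return [[math.exp(-lam * (a - b) ** 2) for b in line] for a in line]


random.seed(0)
period = 8                                   # constant disparity (flat depth)
strip = [random.random() for _ in range(period)]
line = [strip[i % period] for i in range(4 * period)]   # repeated basic strip

L = matching(line)
n = len(line)
# The planar interpretation xt = x is always a perfect match.
on_planar_ridge = all(L[x][x] == 1.0 for x in range(n))
# The encoded disparity xt = x + period is a perfect match as well.
on_depth_ridge = all(L[x][x + period] == 1.0 for x in range(n - period))
print(on_planar_ridge, on_depth_ridge)
```

Both ridges reach the maximal value 1, while entries off
the ridges are lower wherever the pattern values differ.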

The sharpness of the ridges is here determined by the
Laplacian, which on a ridge (where $I(x) = I(\tilde{x})$)
equals

$$\nabla^2 \Lambda(x, \tilde{x}) = 2 f'(0) \left[ \left(\frac{dI(x)}{dx}\right)^2 + \left(\frac{dI(\tilde{x})}{d\tilde{x}}\right)^2 \right] \qquad (3)$$

implying that the first derivative of the image controls
the shape of the matching function along the disparity
line. In order to create the high ridges needed for easy
depth perception, one should therefore create an
autostereogram with a repeating pattern that has high
first-order derivatives in every direction. However, we
also want few accidental matches (which will obviously
occur if $I$ has finite range). This leads toward
considering random patterns and assuming a matching
function based on averaging:


$$\Lambda(x, \tilde{x}) = f\left(\mathbb{E}\left([I(x) - I(\tilde{x})]^2\right)\right) \qquad (4)$$

where $\mathbb{E}$ denotes the expectation (averaging) and
$I(x)$ is defined by extending the random pattern selected
on the basic interval $[0, \varphi(0))$. The averaging
process causes false-match peaks to disappear, and so
resolves the ambiguity of choosing among multiple possible
matches.
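The effect of averaging can be illustrated with a small
simulation. The sketch below assumes binary random strips,
a flat depth profile and $f(z) = e^{-\lambda z}$ with an
arbitrary $\lambda$; these choices and all names are
illustrative assumptions. A single line always contains
accidental perfect matches, whereas averaging the squared
differences over many independent lines suppresses them.

```python
import math
import random


def mean_sq_diff(lines):
    """Average of [I(x) - I(xt)]^2 over all lines, for every pair (x, xt)."""
    n = len(lines[0])
    return [[sum((ln[x] - ln[xt]) ** 2 for ln in lines) / len(lines)
             for xt in range(n)] for x in range(n)]


def matching(msd, lam=4.0):
    """Apply the assumed f(z) = exp(-lam * z) element-wise."""
    return [[math.exp(-lam * v) for v in row] for row in msd]


def periodic_line(period, width, rng):
    """One image line: a binary random strip repeated with the given period."""
    strip = [rng.choice((0.0, 1.0)) for _ in range(period)]
    return [strip[i % period] for i in range(width)]


def worst_false_match(L, period):
    """Largest matching value away from the true ridges xt = x + k * period."""
    n = len(L)
    return max(L[x][xt] for x in range(n) for xt in range(n)
               if (xt - x) % period != 0)


rng = random.Random(1)
period, width = 8, 24
single = matching(mean_sq_diff([periodic_line(period, width, rng)]))
averaged = matching(mean_sq_diff(
    [periodic_line(period, width, rng) for _ in range(200)]))

print("single line :", worst_false_match(single, period))
print("200 lines   :", worst_false_match(averaged, period))
```

The true ridges keep the value 1 in both cases, since
every line repeats with the same period.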

The ridge sharpness in the stochastic case is given by

$$\nabla^2 \Lambda(x, \tilde{x}) = 2 f'(0) R''(0) \left[ \left(\frac{dB(x)}{dx}\right)^2 + \left(\frac{dB(\tilde{x})}{d\tilde{x}}\right)^2 \right] \qquad (5)$$

where $B(x)$ is the deterministic “back-projection” into
the interval $[0, \varphi(0))$ and $R(\tau)$ is the
autocorrelation of the process $I(x)$.

This result suggests we can control the shape of the
ridges by choosing as the basic pattern a random process
whose autocorrelation has a high second derivative at the
origin. A completely uncorrelated random process such as
white noise seems ideal in that respect. However, with a
completely uncorrelated random process the basin of
attraction of the obvious match could be so deep that it
would be hard to direct the visual system to the second
ridge, that of the depth-encoding disparities.

Here [5] adopts a coarse-to-fine model as in m, 
assuming that the images presented to us are filtered
by several low-pass, or band-pass, filters to create a
pyramid of coarser and coarser images, and that the
perceptual system works its way from coarse to fine scale
to perceive depth at various resolutions. So, to perceive
depth optimally, the random process we use as the basic
pattern of the autostereogram needs to be scale invariant.
Furthermore, because we aim to direct the visual system
from coarser to finer resolution, it is desirable that the
basins of attraction get narrower at the finer
resolutions.
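One common way to realize such a pyramid is repeated
Gaussian low-pass filtering followed by subsampling. The
one-dimensional sketch below uses that construction purely
as an assumption; the model only requires some family of
low-pass or band-pass filters, and the function names,
kernel width and subsampling factor are hypothetical.

```python
import math


def gaussian_kernel(sigma):
    """Normalized Gaussian taps covering roughly +/- 3 sigma."""
    radius = max(1, int(3 * sigma))
    taps = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(taps)
    return [t / total for t in taps]


def blur(signal, sigma):
    """Low-pass filter a 1-D signal, replicating the border samples."""
    kernel = gaussian_kernel(sigma)
    r = len(kernel) // 2
    n = len(signal)
    return [sum(kernel[j + r] * signal[min(max(i + j, 0), n - 1)]
                for j in range(-r, r + 1)) for i in range(n)]


def pyramid(signal, levels, sigma=1.0):
    """Return [finest, ..., coarsest]; each level is blurred, then halved."""
    out = [list(signal)]
    for _ in range(levels - 1):
        out.append(blur(out[-1], sigma)[::2])
    return out


line = [float((7 * i) % 11) for i in range(64)]   # arbitrary test line
levels = pyramid(line, levels=3)
print([len(level) for level in levels])
```

A coarse-to-fine matcher would start from the last
(coarsest) level and refine its disparity estimate at each
finer level.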

Let us look at the autocorrelation $R(\tau)$ of some
random process with power spectral density $S(f)$:

$$R(\tau) = \int_0^\infty S(f)\, e^{2\pi i f \tau}\, df \qquad (6)$$

$$R''(\tau) = -4\pi^2 \int_0^\infty f^2 S(f)\, e^{2\pi i f \tau}\, df \qquad (7)$$

If we use a low-pass filter with a cut-off frequency of
$f_0 = 1/\sigma$ and normalize $\tau$ by $\sigma$, we get

$$R''_\sigma(0) = -4\pi^2 \sigma^2 \int_0^{1/\sigma} f^2 S(f)\, df \qquad (8)$$

For noise patterns with power spectra of the form
$S(f) = C f^{-\beta}$ ($\beta \in \mathbb{R}$, $C$ constant):

$$R''_\sigma(0) = -4\pi^2 C \sigma^2 \int_0^{1/\sigma} f^{2-\beta}\, df = -\frac{4\pi^2 C}{3 - \beta}\, \sigma^{\beta - 1} \qquad (9)$$

for $\beta \neq 3$.

It is readily observed that for $\beta = 1$ (pink noise),
$R''_\sigma(0)$ is independent of $\sigma$, meaning the
peaks of the matching function have constant normalized
width in scale-space, giving the desired scale invariance
properties. From an unnormalized point of view, the peaks
get narrower with scale from coarse to fine, as intended.

Therefore the model predicts that pink noise leads to easy
depth lock-in across scales with excellent detail
perception.

To demonstrate this concept using our generated
autostereograms, we averaged the result of
$[I(x) - I(\tilde{x})]^2$ over a chosen subset of
successive image lines in several images generated with
the same depth profile and the same noise type. The
matching function is then calculated as
$\Lambda(x, \tilde{x}) = f\left(\mathbb{E}\left([I(x) - I(\tilde{x})]^2\right)\right)$
for a specific choice of $f(z)$ with parameter
$\lambda = 0.001$ (chosen empirically).

The results of those calculations for autostereograms
created with white, pink and brown noise patterns are
presented in Figure 13. We also present the
counter-diagonal of the matching function to show a
one-dimensional view of the basins of attraction.
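The averaging procedure and the counter-diagonal view can
be sketched as follows. This is not the code used for the
reported figures: it assumes synthetic 8-bit white-noise
strips, a flat depth profile and an exponential form
$f(z) = e^{-\lambda z}$, with only the value 0.001 taken
from the text. All names are hypothetical.

```python
import math
import random

LAMBDA = 0.001   # value quoted in the text; the exponential f is an assumption


def averaged_matching(lines, lam=LAMBDA):
    """f(mean over lines of [I(x) - I(xt)]^2) for every pixel pair (x, xt)."""
    n = len(lines[0])
    result = []
    for x in range(n):
        row = []
        for xt in range(n):
            msd = sum((ln[x] - ln[xt]) ** 2 for ln in lines) / len(lines)
            row.append(math.exp(-lam * msd))
        result.append(row)
    return result


def counter_diagonal(L):
    """(displacement, value) pairs along the counter-diagonal, where
    displacement = xt - x is measured from the planar ridge xt = x."""
    n = len(L)
    return [((n - 1 - x) - x, L[x][n - 1 - x]) for x in range(n)]


rng = random.Random(0)
period, width, n_lines = 10, 41, 100
lines = []
for _ in range(n_lines):
    strip = [rng.randrange(256) for _ in range(period)]   # white-noise strip
    lines.append([strip[i % period] for i in range(width)])

profile = dict(counter_diagonal(averaged_matching(lines)))
for d in (0, 2, 10, 20):
    print(f"displacement {d:3d}: {profile[d]:.4f}")
```

For this synthetic white noise the profile equals 1 at
displacements that are multiples of the period and is
close to 0 elsewhere, i.e. very narrow basins of
attraction.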


It can be observed that for white noise, the matching
function has very sharp ridges, but moving between ridges
is quite difficult because it is hard to escape the planar
interpretation ridge. For brown noise, the matching
function has wider ridges and smooth valleys, which allows
for easier movement of the interpretation between ridges,
but may yield a blurry image. Pink noise represents a
balance between the two, having sharp ridges and valleys
that are surmountable.
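The scale dependence expressed by Eq. (9), which underlies
this comparison, can be checked numerically. The sketch
below integrates the normalized expression with a midpoint
rule and compares it with the closed form; it is restricted
to $\beta \in \{0, 1, 2\}$ (white, pink, brown), where the
integrand is regular at $f = 0$. Function names, $C = 1$
and the step count are illustrative assumptions.

```python
import math


def r2_numeric(beta, sigma, c=1.0, steps=20_000):
    """Midpoint-rule value of -4 pi^2 C sigma^2 * int_0^{1/sigma} f^(2-beta) df."""
    upper = 1.0 / sigma
    h = upper / steps
    integral = sum(((k + 0.5) * h) ** (2.0 - beta) for k in range(steps)) * h
    return -4.0 * math.pi ** 2 * c * sigma ** 2 * integral


def r2_closed(beta, sigma, c=1.0):
    """Closed form of Eq. (9), used here only for beta < 3."""
    return -4.0 * math.pi ** 2 * c * sigma ** (beta - 1.0) / (3.0 - beta)


for name, beta in (("white", 0), ("pink", 1), ("brown", 2)):
    values = [r2_numeric(beta, sigma) for sigma in (1.0, 2.0, 4.0)]
    print(name, [round(v, 3) for v in values])
```

Only the pink-noise row is constant across the three
scales; the white-noise values shrink in magnitude and the
brown-noise values grow as $\sigma$ increases.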
Figure 13: Matching function $\Lambda(x, \tilde{x})$ (left)
and basins of attraction (right) for (a), (b) white noise,
(c), (d) pink noise, (e), (f) brown noise. The basins of
attraction are displayed as a function of the pixel
displacement with respect to the planar interpretation
$\tilde{x} = x$.